Abstract

For some arenaviruses, cellular entry into a host species may involve the interaction of the host transferrin receptor 1 gene (TfR1) and the arenaviral glycoprotein. To examine this scenario, a 42 bp nucleotide region of the TfR1 gene, surrounding five conserved amino acid residues (208-212) and the apical domain, was sequenced in host species of North American arenaviruses. The goal was to determine if this region: 1) differed between infected and uninfected individuals of the same host; 2) was polymorphic in a species known to host three genetically divergent strains of a single arenavirus; 3) varied in different rodent species infected with the same arenavirus; and 4) varied between host species and non-host species. Phylogenetic and genetic distance analyses revealed that nucleotides and amino acids were conserved regardless of the host species, arenavirus, or whether a host was infected or uninfected. Results indicated that the TfR1 may not be involved in cellular transportation of North American arenaviruses.
Introduction

Viruses of the family Arenaviridae, genus Arenavirus, are single-stranded ambisense RNA viruses associated with lymphocytic choriomeningitis and various hemorrhagic fevers in humans (Oldstone 2002; Salvato et al. 2005; Milazzo et al. 2011). The genus Arenavirus contains at least 26 species and comprises two genetically and geographically distinct groups (Charrel and de Lamballerie 2003): Lassa-lymphocytic choriomeningitis serocomplex (Old World arenaviruses) and Tacaribe serocomplex (New World arenaviruses). The Tacaribe serocomplex is divided into three South American clades: A (Allpahuayo, Flexal, Parana, Pichinde, and Pirital); B (Amapari, Cupixi, Guanarito, Junin, Machupo, Sabia, and Tacaribe); and C (Latino and Oliveros); and a North American clade containing Bear Canyon, Big Brushy Tank, Catarina, Real de Catorce, Skinner Tank, Tamiami, Tonto Creek, Whitewater Arroyo, and other Whitewater Arroyo-like viruses (Bowen et al. 1997; Charrel et al. 2001; Fulhorst et al. 2001; Salvato et al. 2005; Cajimat et al. 2007a, b, 2008, in press; Milazzo et al. 2008, 2010; Inizan et al. 2010).

The Old World arenaviruses are thought to be associated with rodent species in the family Muridae, whereas New World arenaviruses, with the exception of Tacaribe virus, whose principal host is the fruit bat Artibeus jamaicensis (Downs et al. 1963), are associated with members of the rodent family Cricetidae (Cajimat et al. in press). North American arenaviruses, in particular, appear to use members of the rodent subfamily Neotominae as their principal hosts, whereas South American arenaviruses are associated with the subfamily Sigmodontinae.

In order to gain access into a host cell, arenaviruses produce a non-covalently linked surface glycoprotein precursor (GPC) that is proteolytically cleaved into the glycoproteins GP1 and GP2 (Buchmeier et al. 1987; Southern 1996). GP1 mediates arenavirus association with a host cellular receptor and GP2 mediates fusion of the viral and host cell membranes (Kunz et al. 2005a). Previous studies (Cao et al. 1998; Spiropoulou et al. 2002; Kunz et al. 2005a) indicated that Old World and South American clade C viruses use the host cell surface protein α-dystroglycan as a cellular receptor to interact with the arenaviral GP1. Recently, the surface protein transferrin receptor 1 (TfR1) was identified as the receptor for several pathogenic and non-pathogenic clade B viruses (Radoshitzky et al. 2007; Flanagan et al. 2008; Abraham et al. 2009), although Flanagan et al. (2008) suggested that the non-pathogenic clade B viruses might use other, unknown receptors. In addition, Radoshitzky et al. (2008) identified five amino acid residues (208-212) in the apical domain of the human transferrin receptor gene (hTfR1) as the binding site for the GP1 protein. Alteration of only a few of these residues resulted in TfR1 supporting the cellular entry of otherwise non-pathogenic arenaviruses (Radoshitzky et al. 2008; Abraham et al. 2009). In particular, the presence of a tyrosine at residue 211 is suspected to affect the efficiency of arenaviral entry into a host cell (Radoshitzky et al. 2008).

To date, no cellular receptor has been identified for North American arenaviruses, although it has been determined that, unlike the clade C viruses, they do not use α-dystroglycan (Reignier et al. 2008). In addition, Reignier et al. (2008), in a series of binding and inhibition assays, demonstrated that the Whitewater Arroyo virus does not require hTfR1 to enter cells. However, it remains possible that some North American arenaviruses use TfR1 as a receptor. This hypothesis is supported by the fact that North American arenaviruses are phylogenetically most similar to South American clade B viruses (Charrel et al. 2001, 2008; Salvato et al. 2005; Cajimat et al. 2007a, 2008, in press; Milazzo et al. 2008, 2010). In most of those studies, the GPC gene (particularly the GP1 fragment) was one of the primary genetic markers for ascertaining the phylogenetic relationships. If TfR1 and GP1 binding is responsible for North American arenavirus entry into host cells, then the two genes should evolve in a similar fashion, with nucleotide sequence conservation or evolution in one being mimicked by the other. If TfR1 is not the host receptor for North American arenaviruses, then it would not be subjected to selective pressure by arenaviruses and instead would be constrained by its iron-binding function (Richardson and Ponka 1997), resulting in a higher degree of sequence conservation.

Based on recent studies of nucleotide and amino acid sequences of GP1 among North American arenaviruses (Inizan et al. 2010; Cajimat et al. in press), where high levels of genetic divergence were reported (in some cases as high as 39% and 42%, respectively), high levels of genetic divergence would be predicted for the TfR1 receptor in hosts of North American arenaviruses. Also, given that most New World human pathogens have been associated with the clade B viruses (Flanagan et al. 2008), it is paramount to determine if the TfR1 receptor in hosts of North American arenaviruses possesses the same conserved amino acid residues as that in hosts of the clade B viruses. Therefore, the objectives of this study were fourfold. First, determine if TfR1 nucleotide sequences vary between infected and uninfected individuals of Neotoma albigula from populations where the Whitewater Arroyo arenavirus infection rate ranged from 0 to 40% (Abbott et al. 2004). Second, determine if TfR1 receptors are polymorphic in Neotoma micropus, a species known to host three genetically divergent strains of the Catarina arenavirus (Fulhorst et al. 2002b). Third, determine if TfR1 receptors vary in rodent species (Neotoma macrotis and Peromyscus californicus) infected with the Bear Canyon arenavirus (Fulhorst et al. 2002a; Cajimat et al. 2007b). Fourth, determine if TfR1 sequences vary between hosts (Neotoma spp.) and non-hosts (other species of the Neotominae and Sigmodontinae).
Methods


Sampling and collecting localities.—Specimens were collected from natural populations during previous studies of host-arenavirus relationships (Fulhorst et al. 2001, 2002a; Abbott et al. 2004; Cajimat et al. 2007a, b, 2008, in press; Milazzo et al. 2008, 2010; Inizan et al. 2010); consequently, it was known whether each individual was antibody-positive or antibody-negative and with which arenavirus it was associated. Voucher specimens and tissue samples previously deposited into the Natural Science Research Laboratory, Museum of Texas Tech University, formed the basis of this study. Specimen numbers, collection localities, virus type, and GenBank accession numbers are listed in the Appendix. For some outgroup and reference samples, DNA sequences were obtained from GenBank.

Twenty-three individuals representing North American principal host species were selected to represent antibody-positive and antibody-negative individuals. Efforts were made to include taxa associated with different arenavirus species and divergent strains. Taxa selected included Neotoma albigula (n = 4), N. floridana (n = 1), N. fuscipes (n = 1), N. leucodon (n = 3), N. macrotis (n = 1), N. mexicana (n = 2), N. micropus (n = 8), Peromyscus californicus (n = 2), and Sigmodon hispidus (n = 1). Taxa chosen to represent hosts of South American clade B viruses included Calomys callosus (n = 1), C. musculinus (n = 1), Neacomys spinosus (n = 1), and Zygodontomys brevicauda (n = 1). Four individuals were selected from two woodrat species not known to carry an arenavirus, N. lepida (n = 3) and N. stephensi (n = 1), and 10 individuals were chosen from non-host taxa including Baiomys taylori (n = 1), Cricetulus griseus (n = 1), Peromyscus leucopus (n = 1), Reithrodontomys fulvescens (n = 1), Mus musculus (n = 1), Rattus norvegicus (n = 1), Equus caballus (n = 1), Canis familiaris (n = 1), Felis catus (n = 1), Artibeus jamaicensis (n = 1), and Homo sapiens (n = 1).

RNA isolation and amplification.—To avoid introns and to focus on regions specifically identified by Radoshitzky et al. (2008), mRNA specific to the TfR1 gene was examined in a subset of taxa (seven individuals). Total RNA was isolated from frozen liver tissue using the RNeasy RNA isolation kit (Qiagen, Valencia, California). First-strand cDNA copies were generated from mRNA using reverse transcription (SuperScript II reverse transcriptase, Invitrogen Life Technologies, Inc., Carlsbad, California) and an oligo(dT) 12-18 primer. A 1,500 base region of the TfR1 gene was amplified with primers generated from known sequences of C. callosus, C. musculinus, and Z. brevicauda (Radoshitzky et al. 2008). This nucleotide sequence, consisting of exons 4-17, encoded the ectodomain of the TfR1 protein, including the apical domain that interacts with the arenavirus GP1 protein. Also included in this region is the nucleotide sequence that encodes the amino acid loop corresponding to hTfR1 208-212, implicated by Radoshitzky et al. (2008) as being important in binding to the arenavirus GP1 protein. The polymerase chain reaction (PCR) method (Saiki et al. 1988) was used to amplify the first-strand cDNA. Primers TfR1-5′ (ACAACTATGATGGATCAAGCCAGATCAGCA) and TfR1-3′ (ACAACTACATTTAAAACTCATTGTCAATATTCCAAATGTC) were used with the following step-down PCR protocol for primary amplification: 2 cycles of 95°C (30 s) denaturing, 60°C (45 s) annealing, 72°C (2 min 30 s) extension; 2 cycles of 95°C (30 s) denaturing, 58°C (45 s) annealing, 72°C (2 min 30 s) extension; 2 cycles of 95°C (30 s) denaturing, 56°C (45 s) annealing, 72°C (2 min 30 s) extension; 2 cycles of 95°C (30 s) denaturing, 53°C (45 s) annealing, 72°C (2 min 30 s) extension; and 27 cycles of 95°C (30 s) denaturing, 49°C (45 s) annealing, 72°C (2 min 30 s) extension, followed by a final extension of 72°C (15 min). Following primary amplification, another PCR protocol was used with primers TfR1-73F (CTGGCTCGGCAAGTAGATGGAGATAA) and TfR1-1917R (GAACTGGTTCAGATCCTTCACAAATGACAG) for nested amplification: 35 cycles of 95°C (30 s) denaturing, 48°C (45 s) annealing, 72°C (1 min 30 s) extension, followed by a final extension of 72°C (15 min).
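
The step-down schedule described above can be written down as data, which makes the cycle count and programmed run time easy to check. The following is a minimal sketch, not part of the original protocol files: the names `Stage`, `PRIMARY`, `NESTED`, `total_cycles`, and `hold_time_s` are hypothetical, and the time estimate counts only the programmed holds (temperature ramping is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One block of identical PCR cycles (hypothetical helper type)."""
    cycles: int
    anneal_c: int          # annealing temperature, degrees C
    anneal_s: int          # annealing hold, seconds
    extend_s: int          # 72 C extension hold, seconds
    denature_s: int = 30   # 95 C denaturation hold, 30 s in every stage


# Primary amplification: annealing steps down 60 -> 58 -> 56 -> 53 -> 49 C.
PRIMARY = [
    Stage(2, 60, 45, 150),
    Stage(2, 58, 45, 150),
    Stage(2, 56, 45, 150),
    Stage(2, 53, 45, 150),
    Stage(27, 49, 45, 150),
]

# Nested amplification: a single block of 35 cycles at 48 C annealing.
NESTED = [Stage(35, 48, 45, 90)]

FINAL_EXTENSION_S = 15 * 60  # final 72 C extension, 15 min


def total_cycles(stages):
    """Total number of thermal cycles across all stages."""
    return sum(stage.cycles for stage in stages)


def hold_time_s(stages, final_extension_s=FINAL_EXTENSION_S):
    """Programmed hold time in seconds; ramp time between steps is ignored."""
    cycling = sum(
        stage.cycles * (stage.denature_s + stage.anneal_s + stage.extend_s)
        for stage in stages
    )
    return cycling + final_extension_s


if __name__ == "__main__":
    for name, stages in (("primary", PRIMARY), ("nested", NESTED)):
        minutes = hold_time_s(stages) / 60
        print(f"{name}: {total_cycles(stages)} cycles, {minutes:.1f} min of holds")
```

Both reactions run 35 cycles in total; the step-down design spends the first eight cycles at the more stringent annealing temperatures before settling at the lowest one.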

DNA isolation and amplification.—Based on the nucleotide sequences obtained from the RNA amplifications, a 1,800 base region of the TfR1 gene that targeted the apical domain (regions surrounding amino acids 208-212) was amplified in the seven selected taxa. The amplified region included: 146 bp of exon 5, intron 5, 108 bp of exon 6 (contains the nucleotide sequence that encodes amino acids 208-212), intron 6, and 26 bp of exon 7, producing a total coding length of 280 bp. Genomic DNA was isolated from liver or muscle tissue using the Puregene DNA isolation kit (Gentra Systems, Minneapolis, Minnesota). PCR primers were constructed using Neotoma sequences generated from RNA (TfR1-440F: AGCTGAGCCAGAATACAAATAC; TfR1-783R: CCCTGCTCTAACAATCACTATAGATCC). The following step-down PCR protocol was used for amplification: 2 cycles of 95°C (30 s) denaturing, 60°C (1 min) annealing, 72°C (1 min 40 s) extension; 2 cycles of 95°C (30 s) denaturing, 58°C (1 min) annealing, 72°C (1 min 40 s) extension; 2 cycles of 95°C (30 s) denaturing, 56°C (1 min) annealing, 72°C (1 min 40 s) extension; 2 cycles of 95°C (30 s) denaturing, 53°C (1 min) annealing, 72°C (1 min 40 s) extension; and 27 cycles of 95°C (30 s) denaturing, 50°C (45 s) annealing, 72°C (1 min 40 s) extension, followed by a final extension of 72°C (15 min).

PCR purification and sequencing.—PCR amplicons were purified using ExoSAP-IT (USB Corp., Cleveland, Ohio). All samples were sequenced on an ABI 3100-Avant automated sequencer using ABI Prism Big Dye Terminator v.3.1 (Applied Biosystems, Foster City, California). Sequences were proofed and aligned using Sequencher 4.8 (Gene Codes, Ann Arbor, Michigan) and MEGA4 (Tamura et al. 2007). DNA sequences generated in this study were deposited in GenBank.

Data analysis.—For this study, the nucleotide sequence dataset was truncated from the 280 bp region to a 42 bp region containing the apical domain and amino acids 200-212 of the TfR1 gene that corresponded to the hypothesized binding sites. Pairwise genetic distance analyses were generated using the nucleotide data under the Jukes-Cantor model of evolution (Jukes and Cantor 1969). To obtain values for comparisons within and among species, the mean of the pairwise comparisons was obtained. Genetic distances were estimated within species and between species, between positive and negative individuals within a species, and between individuals of conspecifics associated with different arenaviruses.
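
For readers unfamiliar with the Jukes-Cantor correction, the distance between two aligned sequences is d = -(3/4) ln(1 - (4/3)p), where p is the proportion of sites that differ. The sketch below illustrates that formula and the averaging of pairwise values; it is not the software used in this study, the function names are hypothetical, and skipping sites with gaps or ambiguity codes is an assumption made for the illustration.

```python
import math
from itertools import combinations

BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring gaps and ambiguity codes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in BASES and b in BASES
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor (1969) distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def mean_pairwise_distance(sequences):
    """Mean of all pairwise Jukes-Cantor distances in a group."""
    distances = [jc69_distance(a, b) for a, b in combinations(sequences, 2)]
    if not distances:
        raise ValueError("need at least two sequences")
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Toy 42-bp sequences (made up for illustration, not data from this study).
    ref = "ACGT" * 10 + "AC"
    alt = ref[:-1] + "G"          # a single substitution
    print(f"{100 * jc69_distance(ref, alt):.2f}%")
```

Over a 42 bp alignment a single substitution already contributes more than 2% to one pairwise comparison, so the sub-1% averages reported for most within-species comparisons imply that most pairs of individuals were identical across the region.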

A maximum parsimony analysis was performed using PAUP*4.0 (Swofford 2002) and H. sapiens as the outgroup taxon. Nucleotide positions were treated as unweighted, unordered, discrete characters with possible states A, C, G, and T; heterozygous sites were designated using the accepted International Union of Biochemistry polymorphic code. Uninformative characters were excluded from analyses and optimal trees were estimated using the heuristic search method with tree bisection-reconnection branch swapping and stepwise addition sequence options. Bootstrap analysis (Felsenstein 1985) with 1,000 iterations was used to evaluate nodal support. In addition, nucleotide sequences were translated into amino acids and analyzed as described previously. Phylogenetically informative sites between residues 200 and 220 that have undergone coding changes were plotted on branches to depict patterns of amino acid evolution.
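
Excluding uninformative characters means keeping only the alignment columns in which at least two character states each occur in at least two taxa, because only those columns can favor one tree over another under parsimony. The sketch below shows that filter under simplifying assumptions: it is not PAUP's implementation, the function names are hypothetical, and ambiguity codes and gaps are ignored rather than handled as polymorphisms.

```python
from collections import Counter

BASES = set("ACGT")


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(base for base in column.upper() if base in BASES)
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


def informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return [
        index
        for index, column in enumerate(zip(*alignment))
        if is_parsimony_informative("".join(column))
    ]


if __name__ == "__main__":
    # Toy alignment of four taxa (made up for illustration).
    toy = ["AAGT", "AAGC", "ATCT", "ATCA"]
    print(informative_sites(toy))
```

In the toy alignment the first column is invariant and the last has only one state shared by two taxa, so only the two middle columns are retained.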

A Bayesian analysis was conducted using MrBayes software (Huelsenbeck and Ronquist 2001) with H. sapiens as the outgroup. MODELTEST (Posada and Crandall 1998) identified the GTR+I+G model as the most appropriate model of evolution. TfR1 sequences were partitioned by codon, four Markov chains were run for 10,000,000 generations, and trees were sampled every 1,000 generations. The first 10,000 trees were discarded to ensure that unstable trees were removed from analyses; remaining sampled trees were imported into PAUP (Swofford 2002) and a majority-rule consensus tree was constructed. The analysis was repeated four times. Clade probability values >0.95 indicated nodal support and were included on the phylogenetic tree.


Results 


Average distance values (Jukes and Cantor 1969) were estimated for several comparisons and are shown in Table 1. Distances between virus-positive and virus-negative individuals within a species ranged from 0.41% (N. leucodon) to 0.49% (N. micropus). The average distance between individuals of N. micropus that hosted different arenaviruses was low, ranging from 0.00% (between hosts of the three Catarina virus strains) to 1.10% (between hosts of the Catarina virus and a Whitewater Arroyo-like virus). Species that hosted multiple viruses, N. albigula (0.41% between hosts of Big Brushy Tank and Tonto Creek viruses) and N. mexicana (0.00% between hosts of Skinner Tank and Whitewater Arroyo viruses), showed low genetic distance values relative to other species. Similarly, individuals of a single species of Neotoma that hosted the same virus possessed low values, averaging 0.48%. Average distance values between hosts of the Bear Canyon virus ranged from 0.41% (N. fuscipes to N. macrotis) to 7.04% (N. macrotis to P. californicus). The average genetic distance between species of Neotoma was 1.47% and was lower than average distances between species of Calomys (4.18%) and species of Peromyscus (4.95%).

Table 1. Average genetic distances (AGD), based on nucleotide sequences from the TfR1 gene, for selected comparisons of taxa examined in this study. Abbreviations are as follows: Bear Canyon Virus (BCNV); Skinner Tank Virus (SKTV); Tonto Creek Virus (TTCV); and Whitewater Arroyo Virus (WWAV).

Comparison                                                  AGD

Within Genera (includes virus positive and negative species)
  Neotoma                                                   1.58%
  Peromyscus                                                4.95%
  Calomys                                                   4.18%

Within Species (includes virus positive and negative individuals)
  N. albigula                                               0.54%
  N. lepida                                                 0.69%
  N. leucodon                                               0.55%
  N. mexicana                                               0.00%
  N. micropus                                               0.61%

Between Virus Positive and Virus Negative Individuals within a Species
  N. micropus                                               0.48%
  N. leucodon                                               0.41%

Between Hosts of BCNV
  N. fuscipes-N. macrotis                                   0.41%
  N. fuscipes-P. californicus                               6.60%
  N. macrotis-P. californicus                               7.04%

Between Hosts of Different Arenaviruses
  N. micropus (3 strains of CTNV)                           0.00%
  N. micropus (CTNV and WWA-like 1)                         1.10%
  N. micropus (WWA-like 1 and WWA-like 2)                   0.41%
  N. albigula (BBTV and TTCV)                               0.41%
  N. mexicana (SKTV and WWAV)                               0.00%

In the maximum parsimony analysis, a strict 
consensus tree (Fig. 1) was generated from 2,652 
most-parsimonious trees and the resulting topology 
was characterized by several unresolved clades with 
few synapomorphies positioned at terminal branches. 

Figure 1. Unweighted maximum parsimony tree obtained from analysis of TfR1 nucleotide sequence data. Numbers above branches represent amino acid positions that unite clades. Values below branches are bootstrap support values (only values > 70 are shown). Taxon labels include specimen identification number (TK) and virus status (NH) or virus acronym. Abbreviations for arenaviruses are in the Appendix.

Five major clades were well-supported: one containing individuals of Equus, Canis, Felis, and Artibeus (I, bootstrap value = 73), one representing Rodentia (II, bootstrap value = 100), one including the individuals of Rattus and Mus (III, bootstrap value = 91), one representing the clade containing the South American hosts (IV, bootstrap value = 100), and a clade containing members of the genus Neotoma (V, bootstrap value = 81). Within clade IV (hosts for virus clade B), Calomys callosus and C. musculinus, which host different arenaviruses, displayed three amino acid substitutions within the 208-212 loop. Few individuals of the same species formed well-supported clades, although the two individuals of N. mexicana, one hosting SKTV and one hosting WWAV, formed a monophyletic group. Antibody-positive and antibody-negative individuals did not share any consistent amino acid substitutions, nor did hosts of different viruses.

The Bayesian analysis produced a tree (Fig. 2) that was similar in topology to that obtained from the parsimony analysis. Four clades were supported with clade probability values > 0.95. Clade I corresponded to the order Rodentia; clade II contained members of the family Muridae (Mus and Rattus); clade III contained the hosts for the South American arenaviruses; and clade IV contained members of the genus Neotoma. Relationships among species and among individuals were characterized by the presence of short branch lengths and the accumulation of few nucleotide substitutions.

The amino acid sequence G - L Y L, corresponding to the putative binding site (residues 208-212), was conserved among nearly all species of Neotoma, with the exception of three individuals that possessed a substitution at amino acid 210 (Table 2). This sequence was present in all woodrats, including N. lepida and N. stephensi, which, based on available data, do not host arenaviruses. This same sequence also was present in species other than Neotoma, including North American hosts (P. californicus), South American clade B hosts (Neacomys spinosus and Z. brevicauda), and non-host species (B. taylori). Tyrosine-211, which has been suggested to be necessary for cellular entry of arenaviruses (Radoshitzky et al. 2008), was present in all North American hosts (both antibody-positive and antibody-negative individuals, with the exception of S. hispidus), all South American clade B hosts, and the non-host B. taylori.
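
The comparison of the putative binding loop across taxa can be reproduced directly from the residues listed in Table 2. The sketch below is an illustration only: the dictionary transcribes residues 208-212 for the cricetid taxa in Table 2, the suffixes "(a)" and "(b)" are hypothetical labels for the two P. californicus individuals, and the function names are assumptions rather than part of the original analysis.

```python
# Residues 208-212 transcribed from Table 2 ("-" marks the alignment gap at 209).
LOOP_208_212 = {
    "Neotoma spp. (8 species)": "G-LYL",
    "Neotoma spp. (2 species)": "G-SYL",
    "Neotoma mexicana": "G-LYL",
    "Peromyscus californicus (a)": "G-LYL",
    "Peromyscus californicus (b)": "A-LYL",
    "Sigmodon hispidus": "G-LNL",
    "Baiomys taylori": "G-LYL",
    "Neotoma lepida": "G-LYL",
    "Neotoma stephensi": "G-LYL",
    "Peromyscus leucopus": "T-LYL",
    "Reithrodontomys fulvescens": "G-LNL",
    "Calomys callosus": "G-VYL",
    "Calomys musculinus": "G-SYP",
    "Neacomys spinosus": "G-LYL",
    "Zygodontomys brevicauda": "G-LYL",
}

FIRST_RESIDUE = 208


def residue_at(loop, position, first=FIRST_RESIDUE):
    """Return the one-letter residue at an alignment position (208-212)."""
    index = position - first
    if not 0 <= index < len(loop):
        raise ValueError(f"position {position} is outside the loop")
    return loop[index]


def taxa_without(residue, position, table=LOOP_208_212):
    """Taxa whose residue at `position` differs from `residue`."""
    return sorted(
        taxon for taxon, loop in table.items()
        if residue_at(loop, position) != residue
    )


def taxa_with_motif(motif, table=LOOP_208_212):
    """Taxa whose full 208-212 loop equals `motif`."""
    return sorted(taxon for taxon, loop in table.items() if loop == motif)


if __name__ == "__main__":
    print("Lacking Tyr-211:", taxa_without("Y", 211))
    print("With G-LYL:", len(taxa_with_motif("G-LYL")), "of", len(LOOP_208_212))
```

Run on these entries, only Sigmodon hispidus and Reithrodontomys fulvescens lack tyrosine at position 211, and eight of the fifteen rows carry the full G - L Y L motif, spanning hosts and non-hosts alike.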


Discussion 


Comparisons of genetic distances (Table 1) revealed low sequence divergence within and between species, with an average distance of 0.48% within species of Neotoma and 1.47% between species of Neotoma, suggesting a high level of conservation in the TfR1 gene. Distance values between positive and negative individuals within a species, as well as hosts of different viruses within a species, were comparable to overall distances between individuals within a species. Average distances between hosts of Bear Canyon virus (N. macrotis and N. fuscipes to P. californicus) were comparable to distances between Neotoma and Peromyscus. Although N. micropus is known to host three divergent strains of Catarina virus in southern Texas (Fulhorst et al. 2002b), the degree of genetic divergence between these three virus strains averaged 12% and was not congruent with the level of genetic divergence of the TfR1 genes of their respective hosts (0%). Similarly, hosts of Skinner Tank virus and Whitewater Arroyo virus (individuals of N. mexicana) showed a genetic divergence of 0.00%. In contrast, the genetic distance between C. callosus and C. musculinus (hosts of clade B viruses) was 4.18%, much higher than values obtained for comparisons within Neotoma.

The phylogenetic analyses (parsimony and Bayesian) depicted clades with low overall support and few synapomorphies (Figs. 1 and 2). The distribution of amino acid substitutions did not appear to determine host specificity. For example, a substitution at residue 210 is present in two individuals of N. albigula, one that hosts Tonto Creek virus and one that hosts Big Brushy Tank virus. However, two additional individuals of this species do not possess this substitution and yet are associated with these viruses. In contrast, in hosts of Bear Canyon virus, different amino acid substitutions occurred between different hosts of the same virus. For example, there are multiple substitutions that occur along the lineages of N. macrotis and N. fuscipes which do not occur in P. californicus and vice versa.

Figure 2. Maximum likelihood tree obtained from analysis of TfR1 nucleotide sequence data. Posterior probability values > 0.95 are shown (*) above branches and major clades are depicted by Roman numerals. Taxon labels include specimen identification number (TK) and virus status (NH) or virus acronym. Abbreviations for arenaviruses are in the Appendix.

Table 2. Amino acid differences in the TfR1 gene at residues 204-212 for North American hosts, species not known to host a New World arenavirus, and South American clade B hosts. Sample sizes for each taxon are indicated in parentheses. Abbreviations for amino acids are as follows: A (Alanine), D (Aspartic acid), E (Glutamic acid), G (Glycine), I (Isoleucine), K (Lysine), L (Leucine), M (Methionine), N (Asparagine), P (Proline), R (Arginine), S (Serine), T (Threonine), V (Valine), and Y (Tyrosine).

                                                Amino Acid Residue
Taxon                                   204  205  206  207  208  209  210  211  212

North American Hosts
  Neotoma spp. (8 species, n = 15)       N    A    S    G    G    -    L    Y    L
  Neotoma spp. (2 species, n = 4)        N    A    S    G    G    -    S    Y    L
  Neotoma mexicana (n = 2)               N    E    S    G    G    -    L    Y    L
  Peromyscus californicus (n = 1)        N    A    S    G    G    -    L    Y    L
  Peromyscus californicus (n = 1)        N    E    N    G    A    -    L    Y    L
  Sigmodon hispidus (n = 1)              N    D    N    G    G    -    L    N    L

North American Non-hosts
  Baiomys taylori (n = 1)                N    A    N    G    G    -    L    Y    L
  Neotoma lepida (n = 3)                 N    A    S    G    G    -    L    Y    L
  Neotoma stephensi (n = 1)              N    A    S    G    G    -    L    Y    L
  Peromyscus leucopus (n = 1)            N    E    S    G    T    -    L    Y    L
  Reithrodontomys fulvescens (n = 1)     N    D    N    G    G    -    L    N    L

South American Clade B Hosts
  Calomys callosus (n = 1)               N    A    S    N    G    -    V    Y    L
  Calomys musculinus (n = 1)             N    A    S    G    G    -    S    Y    P
  Neacomys spinosus (n = 1)              N    S    S    G    G    -    L    Y    L
  Zygodontomys brevicauda (n = 1)        N    T    S    G    G    -    L    Y    L

Reference Non-hosts
  Cricetulus griseus (n = 1)             N    V    N    G    D    -    S    D    L
  Rattus norvegicus (n = 1)              -    N    S    G    S    N    S    D    P
  Mus musculus (n = 1)                   Q    S    N    G    N    -    L    D    P
  Felis catus (n = 1)                    G    T    N    S    G    M    V    Y    L
  Canis familiaris (n = 1)               D    M    E    S    D    L    V    Y    L
  Equus caballus (n = 1)                 N    G    S    G    D    M    I    S    L
  Artibeus jamaicensis (n = 1)           A    V    S    S    G    A    G    Y    L
  Homo sapiens (n = 1)                   D    K    N    G    R    L    V    Y    L

However, there are no substitutions shared exclusively by the hosts of Bear Canyon virus. Moreover, there are no amino acid substitutions that differentiate antibody-negative individuals within a species from those that are antibody-positive.

For amino acids 208-212, the sequence G - L Y L was highly conserved (Table 2). Twenty-one of 24 individuals of Neotoma possessed this sequence, including hosts of numerous North American viruses (antibody-positive and antibody-negative) and species not known to host an arenavirus. Although Radoshitzky et al. (2008) identified this region as a key determinant of host specificity of the clade B viruses, the highly conserved nature of this region in Neotoma and Peromyscus and the homology of this region to that of Z. brevicauda and Neacomys spinosus suggest this may not be true for North American viruses.

Despite the lack of genetic variation at amino acids 208-212 of the TfR1 gene in North American hosts, specific host-virus relationships are maintained. One possibility is that this particular region of the apical domain is not the only region that interacts with the arenavirus GP1. The tertiary structure of TfR1 places this amino acid loop near the α-helix and two β-strands, yet these structures are not continuous along the primary sequence (Lawrence et al. 1999), and it cannot be ruled out that other amino acids interact with GP1. In fact, it has been suggested that a single change at amino acid 348, which is located in an α-helix adjacent to residues 208-212 in the tertiary structure, can interfere with the entry of clade B Machupo and Guanarito viruses (Radoshitzky et al. 2008). There may be a similar mechanism in the North American viruses, in which other residues adjacent to this loop in tertiary structure affect the conformation of the TfR1 protein in such a way as to alter host specificity. In addition, amino acid changes elsewhere in the protein, perhaps not in the apical domain or near the GP1 binding site, may affect the tertiary structure indirectly in a way that alters the conformation of the binding site.

Although every host species (except S. hispidus) 
contains a tyrosine at position 211, some species (N. 
lepida, N. stephensi, and B. taylori) that have a tyrosine 
at this site are not known to host an arenavirus. 
Thus, although tyrosine-211 may be necessary for arenavirus 
entry, it alone may not be sufficient. Post-translational 
modifications at the amino acid loop interact with GP1 
and play a role in the efficiency of arenavirus entry. 
Radoshitzky et al. (2008) identified an N-glycosylation 
site at residue 205 of C. callosus, C. musculinus, and Z. 
brevicauda, hosts of Machupo virus, Junin virus, and 
Guanarito virus, respectively. Removal of this glycosylation 
motif increases the efficiency with which these three 
viruses enter the cells of each of the three host 
species (Radoshitzky et al. 2008). α-Dystroglycan, the 
receptor used by Old World and clade C arenaviruses, 
is subject to many post-translational modifications that 
are necessary for the cellular function of this protein 
(Barresi and Campbell 2006) and coincidentally are 
necessary for its function as a receptor for arenaviruses 
(Kunz et al. 2005b). Interestingly, residue 205 
was not conserved in all members of the Cricetidae 
examined; however, site 204 was conserved. TfR1 contains 
many potential glycosylation and phosphorylation 
sites that are necessary for proper functioning (Evans 
and Kemp 1997), and if there are similar glycosylation 
motifs near or at the binding site, this may be an 
additional mechanism of maintaining host specificity 
and virus entry. 
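
As an illustration of how candidate N-glycosylation motifs near the binding site could be screened, the following is a minimal sketch and not the method used in this study. It assumes the standard N-linked sequon definition (asparagine, then any residue except proline, then serine or threonine) and an ungapped amino acid sequence. The function name, the first_residue parameter, and the example fragment are hypothetical; the fragment is invented and is not a real TfR1 sequence.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons (e.g. "NNTS") to all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein, first_residue=1):
    """Return residue numbers of Asn residues that begin an N-X-S/T sequon.

    `protein` must be an ungapped one-letter amino acid string.
    `first_residue` is the residue number of protein[0], so that a fragment
    can be reported in full-length protein coordinates.
    """
    return [m.start() + first_residue for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Invented fragment for illustration only; NOT a real TfR1 sequence.
    toy_fragment = "AGNGSLYKNPTVNHTE"
    # Numbering the first residue as 203 is an arbitrary choice for the demo.
    print(find_sequons(toy_fragment, first_residue=203))  # prints [205, 215]
```

A sequon found by such a scan indicates only a potential site; whether it is glycosylated, and whether that affects GP1 binding, would require experimental confirmation.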

Sigmodon hispidus differed from all other New 
World hosts by the absence of a tyrosine at residue 
211, although this residue also was present in the 
non-arenavirus-associated host Reithrodontomys fulvescens. 
It is important to note that there is genetic 
differentiation between eastern and western lineages 
of S. hispidus based on AFLP and mitochondrial data 
(Phillips et al. 2007; Henson and Bradley 2009). The 
individual examined herein was collected within the 
range of the western lineage, but the Tamiami virus is 
known to occur only within the range of the eastern 
genetic lineage. Future studies should include individuals 
from southern Florida, within the range of the 
Tamiami virus, for comparison. 

There were no consistent differences in TfR1 sequences 
among comparisons made in this study. This 
was surprising given the high level of genetic divergence 
in the GP1 protein (Inizan et al. 2010; Cajimat 
et al. in press) and suggests that host cell entry may not 
be determined by TfR1 but may involve other unknown 
proteins as suggested by Flanagan et al. (2008). Alternatively, 
differential expression of a 
specific protein(s) may play a role in determining infection 
and host specificity (Tayeh et al. 2010). 

Appendix 

Specimens in which the TfR1 gene was sequenced are listed below by GenBank accession number, museum 
number, and associated arenavirus. Some TfR1 sequences were obtained from GenBank (sequences generated in 
previous studies). Abbreviations for New World arenaviruses are as follows: AMAV (Amapari Virus); BCNV 
(Bear Canyon Virus); BBTV (Big Brushy Tank Virus); CTNV (Catarina Virus); GTOV (Guanarito Virus); JUNV 
(Junin Virus); MACV (Machupo Virus); RDCV (Real de Catorce Virus); SKTV (Skinner Tank Virus); TACV 
(Tacaribe Virus); TTCV (Tonto Creek Virus); and WWAV (White Water Arroyo Virus). All localities are in the 
United States unless otherwise noted. 

Baiomys taylori.—Texas: Motley County, 1 mi S Flomot (HM044878, TTU109274, negative). 

Neotoma albigula.—Arizona: Gila County, White Cow Mine (HM044879, TTU97148, TTCV); Yavapai 
County, Cherry Creek (HM044880, TTU88387, TTCV); Gila County, Brushy Tank (HM044882, TTU99846, 
BBTV); and Hackberry Creek (HM044881, TTU99895, BBTV). 

Neotoma floridana.—Oklahoma: Blaine County, 2.9 mi S Entrance of Big Bend Rec. Campgrounds 
(HM044883, TTU109275, negative). 

Neotoma fuscipes.—California: Los Angeles County, Zuma Canyon (HM044884, TTU83037, BCNV). 

Neotoma lepida.—California: Los Angeles County, West Covina, Galser Wilderness Park (HM044885, 
TTU88082, negative) and Arizona: Mohave County, Cottonwood Canyon (HM044886, TTU99919, negative; 
HM044887, TTU99934, negative). 

Neotoma leucodon.—MEXICO: San Luis Potosi, 22.8 km N Real de Catorce (HM044888, TTU102969, 
RDCV); Nuevo Leon, 8.7 km W Doctor Arroyo (HM044889, TTU109270, negative); and Oklahoma: Cimarron 
County, Black Mesa State Park (HM044890, TTU109277, WWA-like). 

Neotoma macrotis.—California: Riverside County, Rancho Capistrano Ortega Mountains (HM044891, 
TTU81391, BCNV). 

Neotoma mexicana.—Arizona: Coconino County, Skinner Tank (HM044892, TTU100791, SKTV); 
and Colorado: Larimer County, Sylvan Dale Guest Ranch near mouth of Big Thompson Canyon (HM044893, 
TTU107426, WWAV). 

Neotoma micropus.—Oklahoma: Cimarron County, 1.5 mi S, 3 mi E Kenton (HM044894, TTU43296, 
WWA-like); New Mexico: Otero County, Fort Bliss (HM044895, TTU79086, WWA-like); and Texas: Dimmitt 
County, Chaparral Wildlife Management Area (HM044896, TTU80915, CTNV; HM044897, TTU80920, CTNV; 
HM044898, TTU81029, CTNV). 

Neotoma micropus.—Texas: Motley County, 1 mi S Flomot (HM044899, TTU109271, negative; HM044900, 
TTU109272, negative; HM044901, TTU109273, negative). 

Neotoma stephensi.—Arizona: Yavapai County, Pine Flat (HM044902, TTU97595, negative). 

Peromyscus californicus.—California: Riverside County, Bear Canyon Trailhead (HM044903, TTU83520, 
BCNV) and Orange County, 2.4 mi NW El Cariso Ranger Station and Ortega Hwy (HM044904, TTU83562, 
BCNV). 

Peromyscus leucopus.—Texas: Dimmitt County, Chaparral Wildlife Management Area (HM044905, 
TTU98086, negative). 

Reithrodontomys fulvescens.—Texas: Dimmitt County, Chaparral Wildlife Management Area (HM044906, 
TTU98091, negative). 

Sigmodon hispidus.—Texas: Dimmitt County, Chaparral Wildlife Management Area (HM044907, TTU9892, 
negative). 

TfR1 sequences obtained from GenBank. Viruses affiliated with each taxon are in parentheses following the 
GenBank number: Artibeus jamaicensis (FJ154605, TACV); Calomys callosus (EU164540, MACV); Calomys 
musculinus (EU164541, JUNV); Canis familiaris (NM001003111, none); Cricetulus griseus (L19142, none); 
Equus caballus (DQ284764, none); Felis catus (AF276984, none); Homo sapiens (NM003234, several); Mus 
musculus (AK088961, none); Neacomys spinosus (FJ154604, AMAV); Rattus norvegicus (NM022722, none); 
and Zygodontomys brevicauda (EU340259, GTOV). 


